7 research outputs found

A New Framework for Join Product Skew

Author: Afrati Foto
Kyritsis Victor
Lekeas Paraskevas V.
Souliou Dora
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Different types of data skew can result in load imbalance in the context of parallel joins under the shared-nothing architecture. We study one important type of skew, join product skew (JPS). A static approach based on frequency classes is proposed, which takes for granted the data distribution of join attribute values. It comes from the observation that the join selectivity can be expressed as a sum of products of frequencies of the join attribute values. As a consequence, an appropriate assignment of join sub-tasks that takes into consideration the magnitude of the frequency products can alleviate the join product skew. Motivated by the aforementioned remark, we propose an algorithm, called Handling Join Product Skew (HJPS), to handle join product skew.
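The observation in the abstract above can be illustrated with a minimal sketch. It is not the HJPS algorithm, whose details the abstract does not give: the function names `join_size` and `assign_subtasks` are hypothetical, and the greedy largest-first placement is an assumed stand-in for "an appropriate assignment of join sub-tasks".

```python
from collections import Counter


def join_size(r_keys, s_keys):
    """Equi-join output size as a sum of per-value frequency products."""
    freq_r, freq_s = Counter(r_keys), Counter(s_keys)
    # Only values present in both relations contribute result tuples.
    return sum(freq_r[v] * freq_s[v] for v in freq_r.keys() & freq_s.keys())


def assign_subtasks(r_keys, s_keys, n_workers):
    """Greedy placement (an assumption, not HJPS): give each join value,
    largest frequency product first, to the currently least-loaded worker."""
    freq_r, freq_s = Counter(r_keys), Counter(s_keys)
    products = {v: freq_r[v] * freq_s[v] for v in freq_r.keys() & freq_s.keys()}
    loads = [0] * n_workers
    plan = [[] for _ in range(n_workers)]
    # Ties are broken by the string form of the value to keep the plan deterministic.
    for value, product in sorted(products.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        worker = loads.index(min(loads))
        plan[worker].append(value)
        loads[worker] += product
    return plan, loads
```

Each worker's load is the number of result tuples it is expected to produce, so balancing the sums of frequency products balances the join output rather than the input sizes.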

arXiv.org e-Print Archive

DSpace at NTUA

Joint Institute for Nuclear Research (JINR)

Improved Methods for Extracting Frequent Itemsets from Interim-Support Trees

Author: Aris Pagourtzis
Dora Souliou
Frans Coenen
Paul Leng
Wojciech Rytter
Publication date: 04/12/2008

Mining association rules in relational databases is a significant computational task with lots of applications. A fundamental ingredient of this task is the discovery of sets of attributes (itemsets) whose frequency in the data exceeds some threshold value. In previous work [9] we have introduced an approach to this problem which begins by carrying out an efficient partial computation of the necessary totals, storing these interim results in a set-enumeration tree. This work demonstrated that making ...

∗ Aris Pagourtzis and Dora Souliou were partially supported for this research by “Pythagoras ...
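The task named in the abstract above, finding itemsets whose frequency reaches a threshold, can be sketched by brute-force counting. This is only an illustration of the problem statement, not the interim-support-tree method of the paper; the function name `frequent_itemsets` and the parameters `min_support` and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every itemset of up to max_size items and keep those that
    occur in at least min_support transactions (naive enumeration)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates dropped
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}
```

Enumerating all subsets grows exponentially with transaction width, which is the cost that partial-total structures such as the set-enumeration tree mentioned in the abstract are meant to reduce.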

CiteSeerX

DSpace at NTUA

Community Detection via Neighborhood Overlap and Spanning Tree Computations

Author: Kulkarni Ketki
Pagourtzis Aris
Potika Katerina
Potikas Petros
Souliou Dora
Publication venue: SJSU ScholarWorks
Publication date: 28/04/2019

Most social networks of today are populated with several millions of active users, while the most popular of them accommodate way more than one billion. Analyzing such huge complex networks has become particularly demanding in computational terms. A task of paramount importance for understanding the structure of social networks as well as of many other real-world systems is to identify communities, that is, sets of nodes that are more densely connected to each other than to other nodes of the network. In this paper we propose two algorithms for community detection in networks, by employing the neighborhood overlap metric and appropriate spanning tree computations
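The neighborhood overlap metric mentioned in the abstract above can be sketched as follows. The abstract does not state the exact definition the authors use, so this assumes the common textbook form: shared neighbours of an edge's endpoints divided by all their neighbours excluding the endpoints themselves. The function name and the adjacency-dictionary representation are hypothetical.

```python
def neighborhood_overlap(adj, u, v):
    """Neighborhood overlap of edge (u, v) in an undirected graph.

    adj maps each node to the set of its neighbours. The value is
    |N(u) & N(v)| / |(N(u) | N(v)) - {u, v}|, or 0.0 when the
    endpoints have no other neighbours.
    """
    common = adj[u] & adj[v]
    union = (adj[u] | adj[v]) - {u, v}
    return len(common) / len(union) if union else 0.0
```

Under this definition, edges inside a dense group score close to 1, while edges that bridge two groups score close to 0, which is what makes the metric a plausible signal for separating communities.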

SJSU ScholarWorks

Frequent itemsets mining

Author: Souliou Dora
Σούλιου Θεοδώρα
Publication venue: 'National Documentation Centre (EKT)'
Publication date: 01/01/2006

Hellenic National Archive of Doctoral Dissertations

Dimensionality Reduction of Accident Databases for Minimal Tradeoff in Prediction Accuracy

Author: Chalikias Miltiadis
Gregoriades Andreas
Souliou Dora
Tambouratzis Tatiana
Publication date: 01/01/2010

Ktisis

Combining probabilistic neural networks and decision trees for maximally accurate and efficient accident prediction

Author: Chalikias Miltiadis
Gregoriades Andreas
Souliou Dora
Tambouratzis Tatiana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010

The extent to which accident severity can be predicted from accident-related data collected at a variety of locations is investigated. The 2005 accident dataset brought together by the Republic of Cyprus Police is employed; this dataset comprises 1407 records of 43 continuous and categorical input parameters and a single categorical output parameter representing accident severity. No transformation of the database has been opted for, either by extracting the parameters that are significant for the prediction task or by modifying the records in any way (e.g. via record selection or transformation). Aiming at maximally accurate and efficient prediction, a combination of probabilistic neural networks (PNNs) and decision trees (DTs) is implemented: the simple training and direct operation of the PNN is complemented by the hierarchical, exhaustive and recursive construction of the DT. By training pairs of PNNs on data from the partitions derived from the minimal necessary number of top DT nodes, both efficiency and accident prediction accuracy are maximized. © 2010 IEEE

Ktisis

DSpace at NTUA

Chalmers Research

Chalmers Publication Library

Maximising Accuracy and Efficiency of Traffic Accident Prediction Combining Information Mining with Computational Intelligence Approaches and Decision Trees

Author: Chalikias Miltiadis
Gregoriades Andreas
Souliou Dora
Tambouratzis Tatiana
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2014

The development of universal methodologies for the accurate, efficient, and timely prediction of traffic accident location and severity constitutes a crucial endeavour. In this piece of research, the best combinations of salient accident-related parameters and accurate accident severity prediction models are determined for the 2005 accident dataset brought together by the Republic of Cyprus Police. The optimal methodology involves: (a) information mining in the form of feature selection of the accident parameters that maximise prediction accuracy (implemented via scatter search), followed by feature extraction (implemented via principal component analysis) and selection of the minimal number of components that contain the salient information of the original parameters, which combined bring about an overall 74.42% reduction in the dataset dimensionality; (b) accident severity prediction via probabilistic neural networks and random forests, both of which independently accomplish over 96% correct prediction and a balanced proportion of under- and over-estimations of accident severity. An explanation of the superiority of the optimal combinations of parameters and models is given, as is a comparison with existing accident classification/prediction approaches.

Biblioteka Nauki - repozytorium artykułów

Ktisis